Lexicon management and standard formats

Author

  • Éric Laporte
Abstract

International standards for lexicon formats are in preparation. To a certain extent, the proposed formats converge with prior results of standardization projects. However, their adequacy for (i) lexicon management and (ii) lexicon-driven applications has been little debated in the past, and it is not debated as part of the present standardization effort either. We examine these issues. IGM has developed XML formats compatible with the emerging international standards, and we report experimental results on large-coverage lexica.

Similar resources

Towards a Generic Architecture for Lexicon Management

In this paper we propose an architecture for a lexicon management tool MANAGELEX. This tool aims at a general environment for reading, updating and combining lexicons in different formats. The starting point is the already existing lexicon models MULTILEX and GENELEX. Each functionality (reading, updating and combining) is based on a corresponding model, which can be configured and maintained c...

Continuous speech recognition in the WAXHOLM dialogue system

This paper presents the status of the continuous speech recognition engine of the WAXHOLM project. The engine is a software only system written in portable C code. The design is flexible and different modes for phonetic pattern matching are available. In particular, artificial neural networks and standard multiple Gaussian mixtures are implemented for phone probability estimation, and for resea...

Computer Tools for the Management of Lexicon-Grammar Databases

Lexicon grammar is a systematic method for the analysis and the representation of the elementary sentence structures of a natural language; its products are large collections of syntactic electronic dictionaries, or lexicon-grammar tables (LGTs). In order to describe a language, very long-term collaborative work is required. However, the current computer tools for the management of LGTs do not fulfi...

A Computational Lexicon Of Portuguese For Automatic Text Parsing

Using standard methods and formats established at LADL, and adopted by several European research teams to construct large-coverage electronic dictionaries and grammars, we elaborated for Portuguese a set of lexical resources, that were implemented in INTEX. We describe the main features of such linguistic data, refer to their maintenance and extension, and give different examples of automatic text...

Outilex, plate-forme logicielle de traitement de textes écrits

The Outilex software platform, which will be made available to research, development and industry, comprises software components implementing all the fundamental operations of written text processing: processing without lexicons, exploitation of lexicons and grammars, language resource management. All data are structured in XML formats, and also in more compact formats, either readable or bina...

Journal title:
  • CoRR

Volume abs/0711.3449

Publication date 2005